Temporal Difference Learning: a Chemical Process Control Application

Authors

  • Scott Miller
  • Ronald J. Williams

Similar resources

Temporal Difference Learning in Continuous Time and Space

A continuous-time, continuous-state version of the temporal difference (TD) algorithm is derived in order to facilitate the application of reinforcement learning to real-world control tasks and neurobiological modeling. An optimal nonlinear feedback control law was also derived using the derivatives of the value function. The performance of the algorithms was tested in a task of swinging up a ...

Analytical Mean Squared Error Curves in Temporal Difference Learning

We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal difference value estimation algorithms change with offline updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve behavior in various chains, and show the manner in which TD is sensitive to the choice of its step-...

Evolutionary Algorithms for Reinforcement

There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evo...

Learning to Achieve Goals

Temporal difference methods solve the temporal credit assignment problem for reinforcement learning. An important subproblem of general reinforcement learning is learning to achieve dynamic goals. Although existing temporal difference methods, such as Q learning, can be applied to this problem, they do not take advantage of its special structure. This paper presents the DG-learning algorithm, whi...

Structural Measures for Games and Process Control in the Branch Learning Model

Process control problems can be modeled as closed recursive games. Learning strategies for such games is equivalent to the concept of learning infinite recursive branches for recursive trees. We use this branch learning model to measure the difficulty of learning and synthesizing process controllers. We also measure the difference between several process learning criteria, and their difference to co...

Publication date: 1995